Constructing a Temporal Relation Identification System of Chinese based on Dependency Structure Analysis
Abstract
Doctoral Dissertation, Department of Information Processing, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD0561040, February 7, 2008.
Temporal information has been a subject of study in many disciplines, particularly philosophy and physics, and is an important dimension of natural language processing. Temporal information includes temporal expressions, events, and temporal relations. Much research deals with temporal expressions and event expressions. However, research on temporal relation identification and on the construction of temporal-relation-annotated corpora is still limited. There is a well-known temporal information annotation guideline for English, TimeML (Pustejovsky, 2006 [59]), but no comparable research focuses on Chinese. Our research is the first work on temporal relation identification between verbs in Chinese texts. We propose a temporal information annotation guideline for Chinese and a machine learning-based temporal relation identification method. Our investigation shows that the distribution of events and temporal expressions is unbalanced. Temporal information processing includes two independent tasks: anchoring temporal expressions on a timeline and arranging events in temporal order. Our research focuses on ordering events, that is, identifying the temporal relations between events. Because identifying nominal events is difficult, we limit events to the verbs in articles. The proposed annotation guideline is based on the TimeML language. We newly introduce dependency structure information to limit the target temporal relations. The proposed method reduces the manual effort in constructing the annotated corpus. Annotating the temporal relations of all combinations of n events requires n(n-1)/2 manual judgments, whereas our proposed method requires at most 3n.
While the dependency-structure-based attributes reduce manual annotation costs, the limited relations preserve the majority of the temporal relations. We use a syntactically parsed corpus, the Penn Chinese Treebank, as the original data for building a basic annotated corpus. To use dependency structure in temporal relation identification, we first construct a dependency analyzer for Chinese and integrate it into the temporal relation annotation system. The accuracy of the dependency analyzer is 88% for word dependency analysis, which is better than existing Chinese dependency analyzers. The process of temporal relation identification includes the following steps: analyzing the dependency structure, analyzing the temporal relation attributes of events, and extending the relations using inference rules. We define events as those expressed by verbs and define temporal relation types for three kinds of event pairs: adjacent event pairs, head-modifier event pairs, and sibling event pairs. These relations carry most of the meaningful information, and we extend them using inference rules to acquire long-distance relations. We train a machine learner on our temporal-relation-annotated corpus to construct the temporal relation identification system. A Support Vector Machine is used as the machine learner in this system. We survey the coverage of our system with a small corpus. The accuracies of the annotation experiments are 68% to 71% for annotating the temporal relation attributes. The results show that our proposed system covers about 53% of the temporal relations of all possible event pairs.
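The pair selection and inference steps described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. The names `candidate_pairs` and `extend_before`, the `heads` list encoding, the reading of "sibling" as consecutive children of the same head, and the use of plain transitivity of "before" as the inference rule are all assumptions made for illustration.

```python
def candidate_pairs(heads):
    """Select event pairs to annotate from a verb-level dependency tree.

    heads[i] is the index of the head event of event i (events are in
    textual order), or None for the root. This encoding is hypothetical.
    """
    n = len(heads)
    pairs = set()

    # Adjacent event pairs: neighbours in textual order (n - 1 pairs).
    for i in range(n - 1):
        pairs.add((i, i + 1))

    # Head-modifier event pairs: each event with its head (at most n - 1).
    children = {}
    for i, h in enumerate(heads):
        if h is not None:
            pairs.add((min(i, h), max(i, h)))
            children.setdefault(h, []).append(i)

    # Sibling event pairs: assumed here to mean consecutive children of
    # the same head, which keeps the count linear in n.
    for kids in children.values():
        for a, b in zip(kids, kids[1:]):
            pairs.add((a, b))

    return pairs


def extend_before(before):
    """Extend a set of (earlier, later) pairs by transitivity.

    Transitivity of "before" is an assumed stand-in for the inference
    rules mentioned in the abstract.
    """
    closure = set(before)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


if __name__ == "__main__":
    heads = [1, None, 1, 2, 1, 4]  # toy tree with six verb events
    n = len(heads)
    selected = candidate_pairs(heads)
    print(len(selected), "selected pairs versus", n * (n - 1) // 2, "possible pairs")
    print(sorted(extend_before({(0, 1), (1, 2)})))
```

Under these assumptions each of the three pair types contributes fewer than n pairs, so the number of manual judgments stays below 3n, while the closure step recovers some long-distance relations that were never annotated directly.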
Related resources
The Rhetorical - Aesthetic Approach to Constructing the Relation between Images and Visual Inventions with Global Politics
Images and photos play an important role in our understanding of domestic and international events. Today we are living in the age of the visualization of politics. The images are vague, rhetorical, and aesthetic components of political and social phenomena and can give them a beautiful or detestable structure. In the digital age, images in and of themselves can define our structure and vision ...
NAIST.Japan: Temporal Relation Identification Using Dependency Parsed Tree
In this paper, we attempt to use a sequence labeling model with features from dependency parsed tree for temporal relation identification. In the sequence labeling model, the relations of contextual pairs can be used as features for relation identification of the current pair. Head-modifier relations between pairs of words within one sentence can be also used as the features. In our preliminary...
Use of Event Types for Temporal Relation Identification in Chinese Text
This paper investigates a machine learning approach for identification of temporal relation between events in Chinese text. We proposed a temporal relation annotation guideline (Cheng, 2007) and constructed temporal information annotated corpora. However, our previous criteria did not deal with various uses of Chinese verbs. For supplementing the previous version of our criteria, we introduce a...
Automatic Conversion of a Persian Dependency Treebank to a Constituency Treebank
There are two major types of treebanks: dependency-based and constituency-based. Both of them have applications in natural language processing and computational linguistics. Several dependency treebanks have been developed for Persian. However, there is no available big size constituency treebank for this language. In this paper, we aim to propose an algorithm for automatic conversion of a depe...
Chinese Semantic Role Labeling with Dependency-Driven Constituent Parse Tree Structure
This paper explores a tree kernel-based method for nominal semantic role labeling (SRL). In particular, a new dependency-driven constituent parse tree (D-CPT) structure is proposed to better represent the dependency relations in a CPT-style structure, which employs dependency relation types instead of phrase labels in CPT. In this way, D-CPT not only keeps the dependency relationship informatio...